Praise be to God Almighty: by His blessing and grace, the teaching module "Machine Learning Lengkap dengan Praktikum" (Complete Machine Learning with Lab Work) has been completed. The module is designed as a comprehensive guide to learning Machine Learning, from basic concepts through practical implementation in several programming languages.
Machine Learning has become one of the most important and fastest-growing fields in technology. From product recommendations in e-commerce and face recognition on smartphones to autonomous cars, Machine Learning has changed the way we interact with technology.
The module is organized systematically around a lab-based (praktikum) learning approach, so each topic is paired with supporting practical material.
We hope this module helps students, instructors, and practitioners master Machine Learning. Constructive criticism and suggestions are very welcome so that the module can be improved in future editions.
Jakarta, January 2024
The Authoring Team
Machine Learning is a branch of Artificial Intelligence that enables a system to learn from experience and improve its performance without being explicitly programmed for each task.
| Application | Description |
|---|---|
| Product Recommendation | Amazon, Netflix, and Spotify recommend items based on user preferences |
| Face Recognition | Face ID on the iPhone, automatic tagging on Facebook |
| Spam Filter | Gmail detects and filters spam email |
| Virtual Assistant | Siri, Google Assistant, and Alexa understand voice commands |
| Fraud Detection | Banks detect suspicious transactions |
| Year | Development |
|---|---|
| 1950 | Alan Turing proposes the "Turing Test" as a test of machine intelligence |
| 1952 | Arthur Samuel creates a checkers program that can learn |
| 1957 | Frank Rosenblatt creates the Perceptron |
| 1967 | The Nearest Neighbor algorithm is developed |
| 1980s | Backpropagation and neural networks gain traction |
| 1995 | Support Vector Machines (SVM) are introduced |
| 1997 | Deep Blue defeats world chess champion Garry Kasparov |
| 2006 | Geoffrey Hinton and colleagues publish work on deep belief networks, reviving interest in deep learning |
| 2012 | AlexNet wins the ImageNet competition |
| 2016 | AlphaGo defeats Go world champion Lee Sedol |
| 2020+ | GPT-3, DALL-E, and other generative AI models |
Supervised learning: the model learns from data that already carries labels. Examples: spam email classification, house price prediction.
Unsupervised learning: the model learns from data without labels. Examples: customer clustering, market segmentation.
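To make learning without labels concrete, here is a minimal k-means sketch in pure Python. The customer values, the choice of `k=2`, and the "first k points" initialisation are all assumptions made for illustration; library implementations such as scikit-learn's `KMeans` use more careful initialisation.

```python
# Minimal k-means sketch: no labels are used anywhere.
# The data points and k=2 are made-up values for illustration.

def kmeans(points, k, iterations=10):
    """Cluster 2-D points; returns (centroids, assignments)."""
    # Assumption: take the first k points as initial centroids so the
    # run is deterministic. Real libraries initialise more carefully.
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for idx, (x, y) in enumerate(points):
            distances = [(x - cx) ** 2 + (y - cy) ** 2 for cx, cy in centroids]
            assignments[idx] = distances.index(min(distances))
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, assignments


# Hypothetical customers: (visits per month, average basket size)
customers = [(1, 2), (9, 8), (2, 1), (8, 9), (1, 1), (9, 9)]
centroids, assignments = kmeans(customers, k=2)
print("Centroids:", centroids)
print("Cluster of each customer:", assignments)
```

The algorithm only ever sees the coordinates; the two groups emerge from distances alone, which is the sense in which the model "discovers" structure.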
Reinforcement learning: the model learns by interacting with an environment and receiving rewards or penalties. Examples: game AI, robot control.
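The reward-driven loop can be sketched with tabular Q-learning on a toy problem. Everything about the environment here is an assumption for illustration: a five-cell corridor, a reward of 1 at the right-hand end, and hand-picked values for the learning rate and discount factor.

```python
import random

N_STATES = 5              # corridor cells 0..4; reaching cell 4 ends the episode
GOAL = N_STATES - 1
LEFT, RIGHT = 0, 1
ALPHA, GAMMA = 0.5, 0.9   # learning rate and discount factor (assumed values)


def step(state, action):
    """Toy environment: move one cell; reward 1 only on reaching the goal."""
    next_state = min(state + 1, GOAL) if action == RIGHT else max(state - 1, 0)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


rng = random.Random(0)
q_table = [[0.0, 0.0] for _ in range(N_STATES)]  # q_table[state][action]

for _ in range(200):
    state = 0
    while state != GOAL:
        # Q-learning is off-policy, so this sketch explores with random actions.
        action = rng.choice([LEFT, RIGHT])
        next_state, reward = step(state, action)
        # Move the estimate toward: reward + discounted best future value.
        target = reward + GAMMA * max(q_table[next_state])
        q_table[state][action] += ALPHA * (target - q_table[state][action])
        state = next_state

# Greedy policy read off the learned table for the non-terminal cells.
policy = ["right" if q[RIGHT] > q[LEFT] else "left" for q in q_table[:GOAL]]
print(policy)
```

No one tells the agent that "right" is correct; the preference appears because the reward propagates backward through the table one cell at a time.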
Semi-supervised learning: a combination of supervised and unsupervised learning that uses a small amount of labeled data together with a large amount of unlabeled data.
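One simple way to use a few labels together with many unlabeled points is self-training: repeatedly pseudo-label the unlabeled point the current model is most sure about, then treat it as training data. The sketch below uses a 1-nearest-neighbour rule on made-up one-dimensional values; it is only one of several semi-supervised strategies, and the function name `self_train` is hypothetical.

```python
def self_train(labeled, unlabeled):
    """labeled: list of (value, label); unlabeled: list of values.

    Returns a dict mapping every value to its (pseudo-)label.
    """
    labeled = list(labeled)
    remaining = list(unlabeled)
    while remaining:
        # Find the unlabeled value closest to any labeled value: this is
        # the prediction the 1-nearest-neighbour rule trusts the most.
        _, value, label = min(
            ((abs(u - v), u, lab) for u in remaining for v, lab in labeled),
            key=lambda item: item[0],
        )
        labeled.append((value, label))  # the pseudo-label becomes training data
        remaining.remove(value)
    return dict(labeled)


# Two labeled examples and five unlabeled ones (made-up values).
seed_labels = [(1.0, "small"), (10.0, "large")]
unlabeled_values = [1.5, 2.5, 4.0, 9.2, 8.0]
result = self_train(seed_labels, unlabeled_values)
for value in sorted(result):
    print(value, result[value])
```

Labels spread outward from the two seeds, so points far from any original label are decided by their already pseudo-labeled neighbours.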
| Type | Input Data | Goal | Example Algorithms |
|---|---|---|---|
| Supervised | Labeled | Prediction | Linear Regression, SVM, Random Forest |
| Unsupervised | Unlabeled | Pattern discovery | K-Means, PCA, Apriori |
| Reinforcement | Reward signal | Decision making | Q-Learning, Deep Q Network |
```bash
pip install scikit-learn numpy matplotlib
```

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt

# Load the iris dataset
iris = datasets.load_iris()
X = iris.data    # features
y = iris.target  # labels

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Build the KNN model
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# Predict
y_pred = knn.predict(X_test)

# Evaluate
accuracy = accuracy_score(y_test, y_pred)
print(f"Model accuracy: {accuracy:.2f}")

# Show dataset information
print(f"Number of samples: {len(X)}")
print(f"Number of features: {X.shape[1]}")
print(f"Feature names: {iris.feature_names}")
print(f"Target names: {iris.target_names}")

# Simple visualization of the first two features, one color per class
plt.figure(figsize=(10, 6))
colors = ['red', 'green', 'blue']
for i, color in enumerate(colors):
    mask = y == i  # boolean mask selecting the samples of class i
    plt.scatter(X[mask, 0], X[mask, 1], c=color, label=iris.target_names[i])
plt.xlabel('Sepal Length')
plt.ylabel('Sepal Width')
plt.legend()
plt.title('Iris Dataset')
plt.savefig('iris_plot.png')
plt.show()
```

```bash
python ml_intro.py
```
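What a k-nearest-neighbours classifier does at prediction time can be sketched in a few lines of plain Python. This is an illustration of the idea, not scikit-learn's implementation, and the six training points are made-up (petal length, petal width) values rather than rows from the iris dataset.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest points and let them vote.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up (petal length, petal width) measurements in cm.
train_points = [(1.4, 0.2), (1.3, 0.2), (1.5, 0.3),
                (4.5, 1.5), (4.7, 1.4), (4.0, 1.3)]
train_labels = ["setosa", "setosa", "setosa",
                "versicolor", "versicolor", "versicolor"]

prediction_a = knn_predict(train_points, train_labels, (1.6, 0.4))
prediction_b = knn_predict(train_points, train_labels, (4.4, 1.4))
print(prediction_a, prediction_b)
```

There is no training step beyond storing the data, which is why `n_neighbors` is the main knob: a small value follows local noise, a large one smooths over class boundaries.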
| Concept | Definition | Examples |
|---|---|---|
| Artificial Intelligence (AI) | Machines that can imitate human intelligence | Expert systems, chatbots, game AI |
| Machine Learning (ML) | A subset of AI that lets machines learn from data | Prediction, classification, clustering |
| Deep Learning (DL) | A subset of ML that uses neural networks with many layers | Image recognition, NLP, self-driving cars |
```python
def classify_fruit_rule_based(color, texture, weight):
    """Classify a fruit using hand-written rules."""
    if color == "red" and texture == "smooth":
        if weight < 100:
            return "Cherry"
        else:
            return "Apple"
    elif color == "yellow" and texture == "smooth":
        if weight > 100:
            return "Banana"
        else:
            return "Lemon"
    elif texture == "rough":
        return "Orange"
    else:
        return "Unknown"


# Testing
test_cases = [
    ("red", "smooth", 80),
    ("red", "smooth", 150),
    ("yellow", "smooth", 120),
    ("yellow", "rough", 200),
]

for color, texture, weight in test_cases:
    result = classify_fruit_rule_based(color, texture, weight)
    print(f"({color}, {texture}, {weight}g) -> {result}")
```
```python
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, plot_tree

# Build the training dataset
data = {
    'color': ['red', 'red', 'yellow', 'yellow', 'orange', 'green'],
    'texture': ['smooth', 'smooth', 'smooth', 'rough', 'rough', 'smooth'],
    'weight': [80, 150, 120, 200, 180, 90],
    'fruit': ['cherry', 'apple', 'banana', 'orange', 'orange', 'apple']
}
df = pd.DataFrame(data)

# Encode the categorical features as integers
color_map = {'red': 0, 'yellow': 1, 'orange': 2, 'green': 3}
texture_map = {'smooth': 0, 'rough': 1}
fruit_map = {'cherry': 0, 'apple': 1, 'banana': 2, 'orange': 3}

df['color_encoded'] = df['color'].map(color_map)
df['texture_encoded'] = df['texture'].map(texture_map)
df['fruit_encoded'] = df['fruit'].map(fruit_map)

feature_columns = ['color_encoded', 'texture_encoded', 'weight']
X = df[feature_columns]
y = df['fruit_encoded']

# Train the decision tree
dt = DecisionTreeClassifier(max_depth=3, random_state=42)
dt.fit(X, y)

# Test with new data (same column names as the training features)
new_samples = pd.DataFrame([
    [color_map['red'], texture_map['smooth'], 80],
    [color_map['red'], texture_map['smooth'], 150],
    [color_map['yellow'], texture_map['smooth'], 120],
    [color_map['orange'], texture_map['rough'], 200],
], columns=feature_columns)

predictions = dt.predict(new_samples)
fruit_names = {v: k for k, v in fruit_map.items()}

print("\nML prediction results:")
for i, pred in enumerate(predictions):
    print(f"Sample {i+1}: {fruit_names[pred]}")

# Visualize the decision tree
plt.figure(figsize=(15, 8))
plot_tree(dt, feature_names=['color', 'texture', 'weight'],
          class_names=list(fruit_map.keys()), filled=True)
plt.savefig('decision_tree.png')
```
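A note on the integer encoding used in `color_map`: mapping colors to 0, 1, 2, 3 implies an order (red < yellow < orange < green) that has no real meaning. Tree models can usually work around this, but distance-based models may be misled by it. One common alternative is one-hot encoding, sketched here in plain Python; the helper name `one_hot` is hypothetical, and libraries such as pandas and scikit-learn provide their own encoders.

```python
def one_hot(values):
    """Map each category to a 0/1 vector containing a single 1."""
    categories = sorted(set(values))           # fixed, reproducible column order
    index = {cat: i for i, cat in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1                  # mark the column of this category
        encoded.append(row)
    return categories, encoded


colors = ['red', 'red', 'yellow', 'yellow', 'orange', 'green']
categories, encoded = one_hot(colors)
print(categories)
for color, row in zip(colors, encoded):
    print(color, row)
```

Each color becomes its own column, so no color is numerically "between" two others.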
| Language | Libraries/Frameworks | Strengths | Weaknesses |
|---|---|---|---|
| Python | scikit-learn, TensorFlow, PyTorch, Keras | Easy to learn, extensive libraries, large community | Pure-Python code is slower than compiled languages; the GIL limits multi-threaded parallelism |
| R | caret, randomForest, xgboost, tidyverse | Strong statistics, good visualization | Steep learning curve, less commonly used in production |
| Java | Weka, Deeplearning4j, MOA | Enterprise ready, good performance | Verbose, limited library selection |
| JavaScript | TensorFlow.js, Brain.js, ml5.js | Runs in the browser, real-time | Limited performance, relatively young libraries |
| C++ | TensorFlow C++, dlib, Shark | Maximum performance | Hard to learn, slow development |
| Julia | Flux.jl, MLJ.jl | Aims for C-like speed with Python-like ease | Small community, limited library selection |
```python
import numpy as np
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt

# House data: size (m²) vs price (million Rp)
X = np.array([50, 60, 70, 80, 90, 100]).reshape(-1, 1)
y = np.array([500, 600, 680, 750, 820, 880])

# Train the model
model = LinearRegression()
model.fit(X, y)

# Predict
X_test = np.array([65, 85, 95]).reshape(-1, 1)
y_pred = model.predict(X_test)

print("Predicted prices:")
for size, price in zip(X_test.flatten(), y_pred):
    print(f"House {size}m²: Rp {price:.0f} million")

print(f"\nCoefficient: {model.coef_[0]:.2f}")
print(f"Intercept: {model.intercept_:.2f}")

# Visualization
plt.scatter(X, y, color='blue', label='Actual data')
plt.plot(X, model.predict(X), color='red', label='Regression line')
plt.scatter(X_test, y_pred, color='green', s=100, label='Predictions')
plt.xlabel('House Size (m²)')
plt.ylabel('Price (million Rp)')
plt.legend()
plt.grid(True)
plt.savefig('linear_regression.png')
```
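For a single feature, the fitted line can also be computed by hand with the ordinary least squares formulas: slope = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and intercept = ȳ - slope · x̄. The sketch below applies them to the same six houses in plain Python, as a way to check where the coefficient and intercept come from.

```python
sizes = [50, 60, 70, 80, 90, 100]         # m², same values as the example above
prices = [500, 600, 680, 750, 820, 880]   # million Rp

mean_x = sum(sizes) / len(sizes)
mean_y = sum(prices) / len(prices)

# Slope: covariance of x and y divided by the variance of x.
numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, prices))
denominator = sum((x - mean_x) ** 2 for x in sizes)
slope = numerator / denominator

# Intercept: the line must pass through the point (mean_x, mean_y).
intercept = mean_y - slope * mean_x

print(f"Slope: {slope:.2f}")          # 7.51
print(f"Intercept: {intercept:.2f}")  # 141.43
print(f"Predicted price for 65 m²: {intercept + slope * 65:.0f} million Rp")
```

The slope reads as roughly 7.5 million Rp per additional square metre over this range of sizes.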
```r
# Install the packages if they are not available yet
# (method = "rf" below also requires the randomForest package)
# install.packages("ggplot2")
# install.packages("caret")

library(ggplot2)
library(caret)

# Load the iris dataset
data(iris)

# Explore the data
summary(iris)
str(iris)

# Split the data
set.seed(123)
trainIndex <- createDataPartition(iris$Species, p = .8,
                                  list = FALSE, times = 1)
train <- iris[ trainIndex, ]
test  <- iris[-trainIndex, ]

# Train a Random Forest with 5-fold cross-validation
model <- train(Species ~ ., data = train, method = "rf",
               trControl = trainControl(method = "cv", number = 5))

# Predict
predictions <- predict(model, test)

# Evaluate
confusionMatrix(predictions, test$Species)

# Visualization
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point(size = 3) +
  theme_minimal() +
  labs(title = "Iris Dataset Visualization")
```
```java
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.classifiers.trees.J48;
import weka.classifiers.Evaluation;
import java.util.Random;

public class WekaDemo {
    public static void main(String[] args) {
        try {
            // Load dataset
            DataSource source = new DataSource("data/iris.arff");
            Instances data = source.getDataSet();

            // Set class index (target variable)
            if (data.classIndex() == -1)
                data.setClassIndex(data.numAttributes() - 1);

            // Train J48 decision tree
            J48 tree = new J48();
            tree.buildClassifier(data);

            // Print tree
            System.out.println(tree);

            // 10-fold cross-validation evaluation
            Evaluation eval = new Evaluation(data);
            eval.crossValidateModel(tree, data, 10, new Random(1));

            // Print results
            System.out.println(eval.toSummaryString());
            System.out.println(eval.toClassDetailsString());
            System.out.println(eval.toMatrixString());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
```
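The 10-fold cross-validation call above splits the data into ten parts and uses each part once as the test set. The splitting logic can be sketched in plain Python as follows. This is a simplified illustration: the helper name `k_fold_indices` is hypothetical, and it neither shuffles nor stratifies the data, which real implementations typically do.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# 150 samples (the size of the iris dataset) split into 10 folds.
folds = list(k_fold_indices(150, 10))
for fold_number, (train, test) in enumerate(folds, start=1):
    print(f"Fold {fold_number}: {len(train)} train, {len(test)} test")
```

Every sample is used for testing exactly once, so the averaged score depends less on one lucky or unlucky split than a single train/test division does.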
```python
import time

import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Generate a large synthetic dataset
X, y = datasets.make_classification(
    n_samples=10000, n_features=20, n_informative=15,
    n_redundant=5, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Try different numbers of trees
n_trees = [10, 50, 100, 200, 500]
results = []

for n in n_trees:
    start_time = time.time()
    rf = RandomForestClassifier(n_estimators=n, random_state=42, n_jobs=-1)
    rf.fit(X_train, y_train)
    train_time = time.time() - start_time
    accuracy = rf.score(X_test, y_test)
    results.append({
        'n_trees': n,
        'accuracy': accuracy,
        'train_time': train_time
    })
    print(f"n_trees={n}: accuracy={accuracy:.4f}, time={train_time:.2f}s")

# Visualize the trade-off
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
times = [r['train_time'] for r in results]
accs = [r['accuracy'] for r in results]

ax1.plot(n_trees, times, 'b-o')
ax1.set_xlabel('Number of Trees')
ax1.set_ylabel('Training Time (seconds)')
ax1.set_title('Training Time vs Number of Trees')
ax1.grid(True)

ax2.plot(n_trees, accs, 'r-o')
ax2.set_xlabel('Number of Trees')
ax2.set_ylabel('Accuracy')
ax2.set_title('Accuracy vs Number of Trees')
ax2.grid(True)

plt.tight_layout()
plt.savefig('rf_tradeoff.png')
```
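One way to build intuition for why accuracy tends to level off as trees are added is an idealised voting model: assume every tree is correct with the same probability and that trees err independently, then compute how often the majority is right. Both assumptions are simplifications (trees in a real forest are correlated, so gains flatten sooner), and the 0.6 per-tree accuracy is a made-up value.

```python
from math import comb


def majority_vote_accuracy(n_voters, p_correct):
    """Probability that a majority of n independent voters is correct (n odd)."""
    needed = n_voters // 2 + 1
    # Binomial tail: sum over every outcome with at least `needed` correct votes.
    return sum(
        comb(n_voters, k) * p_correct ** k * (1 - p_correct) ** (n_voters - k)
        for k in range(needed, n_voters + 1)
    )


voter_counts = [1, 11, 51, 101, 201]
accuracies = [majority_vote_accuracy(n, 0.6) for n in voter_counts]
for n, acc in zip(voter_counts, accuracies):
    print(f"{n:>3} voters: majority correct with probability {acc:.4f}")
```

Under this model each extra batch of voters adds less than the previous one, while training time keeps growing roughly in proportion to the number of trees.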
| Library | Description | Use Case |
|---|---|---|
| TensorFlow.js | Port of TensorFlow to JavaScript | Deep learning, transfer learning |
| Brain.js | Simple neural networks | Prediction, lightweight classification |
| ml5.js | High-level API built on TensorFlow.js | Teaching, creative art |
| Synaptic | Architecture-free neural networks | Architecture experiments |
| Mind | Flexible neural networks | Simple prediction |
```html
<!DOCTYPE html>
<html>
<head>
    <title>TensorFlow.js Demo</title>
    <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@latest"></script>
</head>
<body>
    <h1>Machine Learning with TensorFlow.js</h1>
    <div id="output"></div>

    <script>
        // Create a simple model
        async function createModel() {
            const model = tf.sequential();
            // A single dense layer: 1 unit fed by 1 input feature
            model.add(tf.layers.dense({units: 1, inputShape: [1]}));
            // Compile the model
            model.compile({
                optimizer: 'sgd',
                loss: 'meanSquaredError'
            });
            return model;
        }

        // Generate synthetic data that follows y = 2x - 1
        function generateData() {
            const xs = tf.tensor2d([-1, 0, 1, 2, 3, 4], [6, 1]);
            const ys = tf.tensor2d([-3, -1, 1, 3, 5, 7], [6, 1]);
            return {xs, ys};
        }

        // Train and predict
        async function run() {
            const model = await createModel();
            const {xs, ys} = generateData();

            // Train the model
            await model.fit(xs, ys, {
                epochs: 500,
                callbacks: {
                    onEpochEnd: (epoch, log) => {
                        if (epoch % 100 === 0) {
                            console.log(`Epoch ${epoch}: loss = ${log.loss}`);
                        }
                    }
                }
            });

            // Predict
            const inputs = [5, 6, 7];
            const testXs = tf.tensor2d(inputs, [3, 1]);
            const predictions = model.predict(testXs);
            const data = await predictions.data();

            // Show the results
            const outputDiv = document.getElementById('output');
            outputDiv.innerHTML = '<h2>Prediction Results:</h2>';
            for (let i = 0; i < inputs.length; i++) {
                outputDiv.innerHTML += `x = ${inputs[i]}, y = ${data[i].toFixed(2)}<br>`;
            }

            // Print the learned weight (kernel) and bias
            model.layers[0].getWeights()[0].print();
            model.layers[0].getWeights()[1].print();
        }

        run();
    </script>
</body>
</html>
```
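What the optimizer does during `model.fit` can be sketched without any library: start from arbitrary parameters and repeatedly nudge them against the gradient of the mean squared error. The Python sketch below does this for the same six points; the learning rate of 0.05 and the use of full-batch updates are assumptions for illustration and are not taken from TensorFlow.js.

```python
xs = [-1, 0, 1, 2, 3, 4]
ys = [-3, -1, 1, 3, 5, 7]     # generated by y = 2x - 1

w, b = 0.0, 0.0               # weight and bias of the single unit
learning_rate = 0.05          # assumed step size

for epoch in range(500):
    # Prediction error of the current line on every point.
    errors = [(w * x + b) - y for x, y in zip(xs, ys)]
    # Gradients of the mean squared error with respect to w and b.
    grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / len(xs)
    grad_b = 2 * sum(errors) / len(xs)
    # Step against the gradient.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(f"w = {w:.4f}, b = {b:.4f}")
print(f"Prediction for x = 5: {w * 5 + b:.2f}")
```

The parameters settle close to w = 2 and b = -1, the line that generated the data, which is also what the printed weights of the browser demo should approach.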
```html
<!DOCTYPE html>
<html>
<head>
    <title>Brain.js Demo</title>
    <script src="https://cdn.jsdelivr.net/npm/brain.js@2.0.0/dist/brain-browser.min.js"></script>
</head>
<body>
    <h1>Classification with Brain.js</h1>
    <div id="results"></div>

    <script>
        // Create the neural network
        const net = new brain.NeuralNetwork({
            hiddenLayers: [3],  // 3 neurons in the hidden layer
            activation: 'sigmoid'
        });

        // Training data (XOR problem)
        const trainingData = [
            { input: [0, 0], output: [0] },
            { input: [0, 1], output: [1] },
            { input: [1, 0], output: [1] },
            { input: [1, 1], output: [0] }
        ];

        // Train the network
        console.log('Training started...');
        const stats = net.train(trainingData, {
            iterations: 20000,
            errorThresh: 0.005,
            log: true,
            logPeriod: 1000
        });
        console.log('Training finished. Error: ' + stats.error);

        // Test the network
        const testCases = [
            [0, 0], [0, 1], [1, 0], [1, 1]
        ];

        let output = '<h2>XOR Prediction Results:</h2>';
        output += '<table border="1">';
        output += '<tr><th>Input 1</th><th>Input 2</th><th>Output</th></tr>';

        testCases.forEach(function(test) {
            // net.run returns an array with one value per output neuron
            const result = net.run(test)[0];
            output += `<tr><td>${test[0]}</td><td>${test[1]}</td><td>${Math.round(result)} (${result.toFixed(4)})</td></tr>`;
        });
        output += '</table>';

        // Show training information
        output += '<h3>Training Information:</h3>';
        output += `Iterations: ${stats.iterations}<br>`;
        output += `Final error: ${stats.error.toFixed(6)}<br>`;

        document.getElementById('results').innerHTML = output;

        // Inspect the network architecture
        console.log('Network architecture:');
        console.log(net.toJSON());
    </script>
</body>
</html>
```
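XOR is a standard test case because its two classes cannot be separated by a single straight line, so a network without a hidden layer cannot represent it. The sketch below shows why a hidden layer is enough, using two hidden units with hand-picked weights and a step activation. These weights are chosen by hand for illustration; they are not what Brain.js learns, and the demo above uses sigmoid units instead.

```python
def step(z):
    """Threshold activation: fire (1) when the weighted input is positive."""
    return 1 if z > 0 else 0


def xor_network(x1, x2):
    # Hidden unit 1 behaves like OR: fires if at least one input is 1.
    h_or = step(x1 + x2 - 0.5)
    # Hidden unit 2 behaves like AND: fires only if both inputs are 1.
    h_and = step(x1 + x2 - 1.5)
    # Output: "OR but not AND", which is exactly XOR.
    return step(h_or - h_and - 0.5)


truth_table = {(a, b): xor_network(a, b) for a in (0, 1) for b in (0, 1)}
for inputs, output in truth_table.items():
    print(inputs, "->", output)
```

Training a network on the XOR data amounts to searching for weights that play a similar role to these two hidden units.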
```html
<!DOCTYPE html>
<html>
<head>
    <title>ml5.js Image Classification</title>
    <script src="https://unpkg.com/ml5@0.12.2/dist/ml5.min.js"></script>
</head>
<body>
    <h1>Image Classification with MobileNet</h1>
    <input type="file" id="imageUpload" accept="image/*">
    <br><br>
    <img id="image" style="max-width: 400px; max-height: 300px;">
    <h2>Classification Results:</h2>
    <div id="result"></div>

    <script>
        let classifier;
        let img;

        // Load the model when the page loads
        classifier = ml5.imageClassifier('MobileNet', modelLoaded);

        function modelLoaded() {
            console.log('MobileNet model is ready!');
            document.getElementById('result').innerHTML = 'Model ready. Please upload an image.';
        }

        // Runs when the user picks a file
        document.getElementById('imageUpload').addEventListener('change', function(event) {
            const file = event.target.files[0];
            if (!file) return;  // the dialog was cancelled
            const reader = new FileReader();

            reader.onload = function(e) {
                img = document.getElementById('image');
                img.onload = function() {
                    // Classify the image once it has finished loading
                    classifier.classify(img, gotResult);
                };
                img.src = e.target.result;
            };
            reader.readAsDataURL(file);
        });

        // Display the classification results
        function gotResult(error, results) {
            if (error) {
                console.error(error);
                return;
            }

            let output = '';
            output += '<table border="1">';
            output += '<tr><th>Label</th><th>Confidence</th></tr>';
            results.forEach(function(result) {
                output += `<tr><td>${result.label}</td><td>${(result.confidence * 100).toFixed(2)}%</td></tr>`;
            });
            output += '</table>';

            document.getElementById('result').innerHTML = output;
        }
    </script>
</body>
</html>
```
```html
<!DOCTYPE html>
<html>
<head>
    <title>Real-time Prediction</title>
    <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@latest"></script>
    <style>
        canvas { border: 1px solid black; }
    </style>
</head>
<body>
    <h1>Real-time Function Approximation</h1>
    <canvas id="canvas" width="600" height="400"></canvas>
    <br>
    <button onclick="startTraining()">Start Training</button>
    <button onclick="generateData()">Generate New Data</button>
    <div id="info"></div>

    <script>
        const canvas = document.getElementById('canvas');
        const ctx = canvas.getContext('2d');
        let model;
        let data = [];

        // Map a data-space x in [-2π, 2π] to a canvas x coordinate
        function toCanvasX(x) {
            return 50 + ((x + 2 * Math.PI) / (4 * Math.PI)) * (canvas.width - 100);
        }

        // Generate noisy samples of the sine function
        function generateData() {
            data = [];
            for (let i = 0; i < 50; i++) {
                const x = (i / 50) * 4 * Math.PI - 2 * Math.PI;
                const y = Math.sin(x) + Math.random() * 0.2 - 0.1;
                data.push({x, y});
            }
            drawData();
        }

        // Draw the axes, the data points, and the model curve if a model exists
        function drawData() {
            ctx.clearRect(0, 0, canvas.width, canvas.height);

            // Draw axes
            ctx.beginPath();
            ctx.strokeStyle = '#999';
            ctx.lineWidth = 1;
            ctx.moveTo(50, 0);
            ctx.lineTo(50, canvas.height);
            ctx.moveTo(0, canvas.height / 2);
            ctx.lineTo(canvas.width, canvas.height / 2);
            ctx.stroke();

            // Draw data points
            ctx.fillStyle = 'red';
            data.forEach(function(point) {
                ctx.beginPath();
                ctx.arc(toCanvasX(point.x), canvas.height / 2 - point.y * 150, 3, 0, 2 * Math.PI);
                ctx.fill();
            });

            if (model) {
                drawPrediction();
            }
        }

        // Draw the model prediction as a curve
        function drawPrediction() {
            const n = 100;
            const xValues = [];
            for (let i = 0; i < n; i++) {
                xValues.push((i / n) * 4 * Math.PI - 2 * Math.PI);
            }
            // Predict all points in one batch; tf.tidy frees the temporary tensors
            const yValues = tf.tidy(() =>
                model.predict(tf.tensor2d(xValues, [n, 1])).dataSync()
            );

            ctx.beginPath();
            ctx.strokeStyle = 'blue';
            ctx.lineWidth = 2;
            for (let i = 0; i < n; i++) {
                const canvasX = toCanvasX(xValues[i]);
                const canvasY = canvas.height / 2 - yValues[i] * 150;
                if (i === 0) {
                    ctx.moveTo(canvasX, canvasY);
                } else {
                    ctx.lineTo(canvasX, canvasY);
                }
            }
            ctx.stroke();
        }

        // Create the model
        function createModel() {
            model = tf.sequential();
            model.add(tf.layers.dense({units: 20, activation: 'relu', inputShape: [1]}));
            model.add(tf.layers.dense({units: 20, activation: 'relu'}));
            model.add(tf.layers.dense({units: 1}));
            model.compile({
                optimizer: tf.train.adam(0.01),
                loss: 'meanSquaredError'
            });
        }

        // Train the model
        async function startTraining() {
            if (!model) createModel();

            const xs = tf.tensor2d(data.map(d => [d.x]));
            const ys = tf.tensor2d(data.map(d => [d.y]));

            document.getElementById('info').innerHTML = 'Training...';

            await model.fit(xs, ys, {
                epochs: 100,
                callbacks: {
                    onEpochEnd: (epoch, logs) => {
                        if (epoch % 10 === 0) {
                            drawData();
                            console.log(`Epoch ${epoch}: loss = ${logs.loss}`);
                            document.getElementById('info').innerHTML =
                                `Epoch ${epoch}: loss = ${logs.loss.toFixed(4)}`;
                        }
                    }
                }
            });

            // Free the training tensors once fitting is done
            xs.dispose();
            ys.dispose();

            drawData();
            document.getElementById('info').innerHTML = 'Training finished!';
        }

        // Initialize
        generateData();
    </script>
</body>
</html>
```